Structural transitions in scale-free networks

Real growing networks like the WWW or personal connection based networks are characterized by
a high degree of clustering, in addition to the small-world property and the absence of a characteristic 
scale. Appropriate modifications of the (Barabasi-Albert) preferential attachment network growth 
capture all these aspects. We present a scaling theory to describe the behavior of the generalized 
models and the mean field rate equation for the problem. This is solved for a specific case with 
the result C(k) ~ 1/k for the clustering of a node of degree k. Numerical results agree with such a 
mean-field exponent which also reproduces the clustering of many real networks.

In diverse fields of scientific interest, underlying network
structures can be recognized, which provide a unifying
concept of investigation. Examples range from biology
(metabolic networks, protein nets in the cell) through
sociology (movie actor relationships, coauthor networks,
sexual nets) to informatics (Internet, WWW). In all these
examples it is easy to identify the constituents of the
problem with the nodes of a graph and their relationships
with directed or undirected links. During the last few
years a great deal of information has accumulated about
such structures. Three features seem to characterize them
rather robustly: i) a high degree of clustering, i.e., if
nodes A and B are linked to node C then there is a good
chance that A and B are also linked; ii) the "Small World"
property, i.e., the expected number of links needed to
reach one arbitrarily selected node from another is low;
iii) the absence of a characteristic scale, which often
appears in that the distribution P(k) of the degrees k of
nodes follows a power law.

It has been noticed from the beginning that clustering 
in real networks is an essential and an almost ubiquitous 
feature. It measures the deviation from a structure with 
vanishing correlations, and it has been used to describe 
the tendency of networks to form cliques or tightly con- 
nected neighborhoods. As an organizing principle, this 
is most obvious in social networks, where connections 
are usually created by personal acquaintances, like in the 
scientific collaboration network. Considerable clustering 
has also been found in networks of more diverse nature. 
Prime examples are the WWW, metabolic and protein 
interaction networks, the actor network, the power grid of 
the United States, the semantic web of English words,
and the backbone of the Internet on both the autonomous
system and the router level. The number of entries in
this list is on the rise as new disciplines are being
taken under consideration and raw data are made
available. A comprehensive examination of clustering in
a variety of real networks can be found in Ref. [?]. In
real networks, as a combination of the properties i) and
iii), the clustering coefficient as a function of the
degree of the nodes often follows a power law:
$C(k) \propto k^{-\alpha}$. The value of $\alpha$ is in
many networks close to 1.
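As a concrete reading of these definitions, the following minimal Python sketch computes the local clustering coefficient of every node and its degree-resolved average C(k). The adjacency format (a dictionary mapping each node to the set of its neighbors) and the function names are assumptions made for illustration; nodes with fewer than two neighbors are assigned zero clustering by convention.

```python
from collections import defaultdict


def local_clustering(adj):
    """Return {node: C_i} for an undirected graph {node: set(neighbors)}."""
    clustering = {}
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            clustering[node] = 0.0  # convention: undefined C_i is set to zero
            continue
        # Count links among the neighbors of `node`; every such link is
        # seen from both of its endpoints, hence the division by 2.
        links = sum(len(adj[u] & neighbors) for u in neighbors) // 2
        clustering[node] = links / (k * (k - 1) / 2)
    return clustering


def clustering_by_degree(adj):
    """Average C_i over all nodes of equal degree k, giving C(k)."""
    local = local_clustering(adj)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, c in local.items():
        k = len(adj[node])
        sums[k] += c
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}
```

For a triangle with one pendant node attached, the two degree-2 nodes have C = 1, the degree-3 node has C = 1/3, and the pendant node has C = 0.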

In 1998 Watts and Strogatz created an interesting family
of models: introducing a rather small fraction of random
links between arbitrarily selected pairs of nodes in a
regular lattice has the consequence that property ii) is
fulfilled while clustering does not decrease considerably,
assuring i). However, the distribution of the degrees of
nodes shows a characteristic peak instead of the required
power law. Barabási and Albert (BA) realized that in the
examples mentioned at the beginning an important aspect
is that the networks are created by growth. BA proposed
preferential attachment (PA) as a growth rule: the new
nodes are linked to the old ones with a probability
proportional to their actual degree. The structures
obtained this way are scale free and have the Small World
property. Although the model captures important aspects
of growing networks, its clustering tends rapidly to a
constant as a function of the degree k and vanishes in
the thermodynamic limit.

Recently, attempts have been undertaken to modify the PA
network growth models so as to increase clustering. In
these models a mechanism, controlled by a new parameter,
is introduced to take into account the effect that
"friends of friends get friends". Indeed, it has been
possible to create models which have all three properties
i)-iii).

The aim of this paper is to present a general framework
for the transition from a PA graph with zero clustering
to still scale-free graphs with $C(k) \propto k^{-\alpha}$,
and to give a corresponding mean-field (MF) and
rate-equation theory. As an example we will take the
Holme-Kim model (a modified BA one), for which the MF rate
equations can be solved exactly, leading to $\alpha = 1$.
This is also shown to describe the simulations very well.
At the end, we discuss the assumptions one needs to make, and how
these reflect the behavior of clustering in general.

We start from the simplest undirected BA model: a new
node j with m links is added to the system at (discrete)
time t. A link from node j to node i is drawn with
probability $k_i / \sum_l k_l$. It is known that the
average clustering at node i is independent of the
degree $k_i$:



$$ C(k_i) = \frac{m-1}{8}\,\frac{(\log N)^2}{N}, \tag{1} $$



i.e., it is inversely proportional to the number N of nodes
(with a logarithmic correction). For the generalization
of the BA model with enhanced clustering, we have a
parameter p representing an imposed tendency to form
triangles on the graph. It is chosen such that at p = 0
the original BA model is recovered.

We propose as a scaling ansatz to describe the cluster- 
ing coefficient C as a function of the degree k, the number 
of nodes N and the parameter p: 



$$ C(k,N,p) = N^{-1} f\!\left(\frac{k}{k^*(N,p)}\right), \tag{2} $$



where f(x) is a scaling function with $f(x) \to \mathrm{const}$
for $x \gg 1$ and $f(x) \to x^{-\alpha}$ for $x \ll 1$, and
the behavior in Eq. (1) is already taken into account by
fixing the exponent of the prefactor of f. The
characteristic degree $k^*$ is a monotonically increasing
function of N for fixed p and it should decrease as p goes
to zero. A natural assumption is then:



$$ k^*(N,p) \sim N^{\gamma} p^{\delta}. \tag{3} $$



Since for small k the clustering C in Eq. (2) should go like
$k^{-\alpha}$ and become independent of N, we have
$\gamma = 1/\alpha$. The exponent $\delta\alpha$ describes
how, for $N \to \infty$, the clustering C approaches its
limiting value zero as p goes to zero. If we accept that
in most cases $\alpha = 1$, there is one exponent to be
determined, say $\delta$. We now clarify the origin of
$\alpha = 1$ and $\delta = 1$ for the model employed.
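The relation between the exponents can be made explicit by inserting the assumed form of the characteristic degree, Eq. (3), into the small-degree limit of the scaling ansatz, Eq. (2). The following short worked step uses only the symbols introduced above, with $\alpha$, $\gamma$ and $\delta$ denoting the exponents of the ansatz:

```latex
\begin{align*}
C(k,N,p) &= N^{-1} f\!\left(\frac{k}{k^*}\right)
  \simeq N^{-1}\left(\frac{k}{k^*}\right)^{-\alpha}
  && (k \ll k^*) \\
 &\sim N^{-1}\, k^{-\alpha} \left(N^{\gamma} p^{\delta}\right)^{\alpha}
  = N^{\gamma\alpha - 1}\, p^{\delta\alpha}\, k^{-\alpha}.
\end{align*}
% N-independence at small k requires \gamma\alpha - 1 = 0, i.e. \gamma = 1/\alpha.
% The remaining prefactor p^{\delta\alpha} vanishes as p \to 0.
```

In the opposite limit $k \gg k^*$ the scaling function is constant and $C \sim N^{-1}$, which is the BA behavior of Eq. (1) up to the logarithmic correction.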

For this purpose we write down the rate equations for 
the clustering in a general form. We thus need to consider 
the average rate of change 



$$ \partial_t n_i = R(k_i,p) \sum_{n \in \Omega} R(k_n,p), \tag{4} $$



where $n_i$ is the number of links between the neighbors of
node i, and $C_i = n_i / (k_i(k_i-1)/2)$. Here R is the rate
at which i gets new links, and we allow, in analogy with
the scaling ansatz presented above, the rate to depend on
both the degree of the node in question and the parameter
p. This parameter can be "annealed" or "quenched",
depending on whether it describes stochastic rules (as in
the example below) or a fixed property of each node i.
E.g., R can simply follow from the preferential attachment
rule. $\Omega$ is the set of neighbors of node i, and the
sum accounts for the probability that a new node linked to
i also links to one of the neighbors of i. This increases
$n_i$ and enhances clustering. In order to make Eq. (4)
more concrete we discuss the triad formation model [14] as
an example.






FIG. 1: Three different options to connect to node i with
m > 2. In (I), a PA step is performed first linking to i and
then a TF step creates a link between neighbors of i. In (II),
the same happens, in a different order. (III) shows how two
PA steps may contribute to $n_i$. Bold edges increase $n_i$.



The complications in solving a rate equation like Eq. (4)
arise from the correlations that are embedded between the
degree of node i and the properties of its neighborhood.
For the triad formation model, the rules consist of a BA
model extended by a triad formation (TF) step. Initially,
the network contains $m_0$ vertices and no edges, and in
every time step a new vertex is added with m undirected
edges. The m edges are then linked one by one to m
different nodes in the network. One performs a
preferential attachment step for the first edge as defined
in the BA model. With probability p, the second and further
edges are joined to a randomly chosen neighbor of the node
selected in the previous PA step. Alternatively, with
probability $1 - p$, a PA step is performed again.
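The growth rules just described can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used for the simulations reported here: the function name is hypothetical, $m_0$ defaults to m, the very first PA steps (when no edges exist yet) fall back to a uniform choice, and a TF step is replaced by a PA step whenever the last PA target has no neighbor that is still unlinked to the new node.

```python
import random


def triad_formation_graph(n_nodes, m, p, m0=None, seed=None):
    """Grow a graph by PA steps and, with probability p, TF steps.

    Returns the adjacency structure as {node: set(neighbors)}.
    """
    rng = random.Random(seed)
    m0 = m if m0 is None else m0  # assumption: default seed size is m
    if m0 < m:
        raise ValueError("need m0 >= m so that m distinct targets exist")
    adj = {v: set() for v in range(m0)}
    stubs = []  # node v appears k_v times, so a uniform draw is a PA draw

    def pa_step(exclude):
        # Assumption: while no edges exist yet, choose uniformly.
        if not stubs:
            return rng.choice([v for v in adj if v not in exclude])
        while True:  # rejection sampling keeps the m targets distinct
            v = rng.choice(stubs)
            if v not in exclude:
                return v

    for new in range(m0, n_nodes):
        targets = set()
        last_pa = pa_step(targets)  # the first edge is always a PA step
        targets.add(last_pa)
        while len(targets) < m:
            candidates = adj[last_pa] - targets
            if candidates and rng.random() < p:
                # TF step: uniformly chosen neighbor of the last PA target
                targets.add(rng.choice(sorted(candidates)))
            else:
                # PA step (also used when no eligible neighbor is left)
                last_pa = pa_step(targets)
                targets.add(last_pa)
        adj[new] = set()
        for v in targets:
            adj[new].add(v)
            adj[v].add(new)
            stubs.extend((new, v))
    return adj
```

Setting p = 0 reduces the sketch to plain BA growth, while p close to 1 makes almost every edge after the first one close a triangle.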

In the limit when p approaches zero, one recovers the
original BA model, and by setting p to a value between 0
and 1 the average clustering can be adjusted continuously
and grows monotonically with increasing p. The microscopic
mechanisms that increase $n_i$ are illustrated in Figure 1
and are (I): the new node connects to node i in a PA step,
which is potentially followed by several TF steps; (II):
the new node connects to one of the neighbors of i in a PA
step and then i in turn gets linked to the new node in one
of the subsequent TF steps; (III): the new node connects
to node i in a PA step and a neighbor of i is also selected
for connection to the new node in another PA step.

Using these for $R(k_i,p)$, the rate equation for $n_i$ reads

$$ \partial_t n_i = m_{PA}\,\frac{k_i}{2mt}\, m_{TF}
   + m_{PA} \sum_{n \in \Omega} \frac{k_n}{2mt}\,\frac{1}{k_n}\, m_{TF}
   + m_{PA}\,\frac{k_i}{2mt}\,(m_{PA}-1) \sum_{n \in \Omega} \frac{k_n}{2mt}. \tag{5} $$



The first term in the sum gives the increase in $n_i$ by
mechanism (I). $m_{PA}$ is the number of PA steps
attempted per new node (recall that one new node is added
per time unit). $k_i/(2mt)$ is the preferential attachment
probability to node i. $m_{TF}$ is the expected number of
triad formation steps that take place on average after a
single PA step. Given this, we have that
$m_{PA} + m_{PA} m_{TF} = m$.

The second term describes mechanism (II); in this term, the
sum goes over all neighbors $\Omega$ of i, and their degrees
are denoted by $k_n$. The factor $1/k_n$ comes from the fact
that the neighboring node to which a TF step links is chosen
uniformly from the neighbors. We exclude here secondary
triangle formation, which takes place if two TF steps from
the new node form a triangle with i and one of i's
neighbors. This becomes more relevant for large p. The term
for (II) gives the same expression as (I) after
simplification.
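The simplification of the term for mechanism (II) is a single step: the degree $k_n$ of each neighbor cancels, and the remaining sum over the neighborhood $\Omega$ only counts the neighbors of node i. Written out with the symbols of the rate equation:

```latex
\begin{align*}
m_{PA} \sum_{n \in \Omega} \frac{k_n}{2mt}\,\frac{1}{k_n}\, m_{TF}
  &= m_{PA}\, m_{TF} \sum_{n \in \Omega} \frac{1}{2mt} \\
  &= m_{PA}\,\frac{k_i}{2mt}\, m_{TF}.
\end{align*}
% The sum over \Omega has exactly k_i terms, one per neighbor of node i.
```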

The last term belongs to (III) and it is the only one that
would remain if we considered the simple BA model. It is
the product of the probabilities of linking to node i and
to one of the neighbors of i, respectively, using only PA
steps. The term contains the sum of the degrees of
neighboring nodes; this is $k_i$ times the average degree
of the neighbors. It has been shown that for uncorrelated
random BA networks
$\langle k_n \rangle = \frac{\langle k \rangle}{4} \log t = \frac{m}{2} \log t$.
In this model the numerical result follows the same scaling
not only for $p \ll 1$ but for general p.

Finally, we get $n_i$ at the end of the network growth by
integrating both sides of Eq. (5). The integral for term
(I) or (II) is simply

$$ \int^{N} m_{PA}\,\frac{k_i}{2mt}\, m_{TF}\, dt
   = \frac{m_{PA} m_{TF}}{m} \int^{N} \frac{d k_i}{dt}\, dt
   = \frac{m_{PA} m_{TF}}{m}\, k_i(N), \tag{6} $$

where we made use of the fact that
$\partial_t k_i = k_i/(2t)$. From this, it also follows that
$k_i(t) = m (t/i)^{1/2}$, and thus integrating (III) gives
$[m_{PA}(m_{PA}-1)/16m]\,[(\log N)^2/N]\,k_i^2$. Combining
this with Eq. (6) yields

$$ n_i = n_{i,0} + \frac{2 m_{PA} m_{TF}}{m}\, k_i
   + \frac{m_{PA}(m_{PA}-1)}{16m}\,\frac{(\log N)^2}{N}\, k_i^2. \tag{7} $$

The local clustering coefficient for node i becomes

$$ C_i(k_i) = \frac{n_i}{k_i(k_i-1)/2}
   \approx \frac{4 m_{TF}}{k_i} + \frac{m-1}{8}\,\frac{(\log N)^2}{N} \tag{8} $$

after neglecting $n_{i,0}$ and approximating $m_{PA}$ by m,
which is reasonable when the triad formation probability is
small. It is not surprising that the constant offset in the
expression of $C_i$ is for $p \to 0$ exactly the constant
clustering coefficient of pure BA graphs. The first term,
more importantly, can be attributed to the triad-formation
induced clustering, and shows the 1/k behavior typical of
many real networks and other models. $C_i$ is composed of a
power law and a constant, so perfect power-law behavior
follows only when the former dominates. In the opposite
case the effective exponent will be less than 1.
Furthermore, since $n_{i,0}$ has been neglected, Eq. (8)
and the inverse proportionality apply only to nodes with
sufficiently large $k_i$.
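The quadratic term of Eq. (7), and with it the constant offset of Eq. (8), can be checked by carrying out the integral of the term for mechanism (III) explicitly. The sketch below assumes that the integration starts at the birth time $t = i$ of node i and that $(\log i)^2$ is negligible compared with $(\log N)^2$; it uses $k_i(t) = m (t/i)^{1/2}$ and $\langle k_n \rangle = (m/2) \log t$ as given in the text.

```latex
\begin{align*}
m_{PA}(m_{PA}-1)\,\frac{k_i}{2mt} \sum_{n\in\Omega} \frac{k_n}{2mt}
  &= m_{PA}(m_{PA}-1)\,\frac{k_i^2\,\langle k_n \rangle}{4 m^2 t^2}
   = m_{PA}(m_{PA}-1)\,\frac{m \log t}{8\, i\, t}, \\
\int_i^N \frac{m \log t}{8\, i\, t}\, dt
  &= \frac{m}{16\, i}\left[(\log N)^2 - (\log i)^2\right]
   \approx \frac{m}{16\, i}\,(\log N)^2
   = \frac{(\log N)^2}{16\, m\, N}\, k_i^2(N).
\end{align*}
% The last step uses 1/i = k_i^2(N) / (m^2 N).
% Dividing by k_i^2/2 and setting m_{PA} \approx m gives (m-1)/8 \cdot (\log N)^2/N.
```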

FIG. 2: Clustering coefficient as a function of the node
degree for m = 5 and different sizes ($10^4$ for ○, 25119
for +, 63096 for *, 158489 for □, 398107 for ◇, and $10^6$
for △). The triad formation probability is uniformly
p = 0.01. The bold line is a least-squares fit to the
largest system and gives
$C \approx 0.028\,k^{-0.97} + 9.9 \cdot 10^{-5}$, where the
prediction is that $C \approx 0.02\,k^{-1} + 9.5 \cdot 10^{-5}$.
The inset shows the data collapse of the power-law part of
C(k).

For further progress, $m_{TF}$, the expected number of
links created in the several possible TF steps after a PA
step for a particular node, needs to be approximated. Take
$m - 1$ edges to be available for successive TF steps (this
is an upper limit) and assume node i is not saturated yet
as far as the connections to the neighbors are concerned.
This gives
$m_{TF} = \sum_{l=1}^{m-2} l\,p^l (1-p) + (m-1)\,p^{m-1} \approx p$
for small p.
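The estimate of $m_{TF}$ and the constraint $m_{PA} + m_{PA} m_{TF} = m$ are easy to evaluate numerically. The following Python sketch does so under the same assumptions as the text (at most m - 1 successive TF steps, node i not saturated); the function names are hypothetical.

```python
def expected_tf_steps(p, m):
    """m_TF: expected number of TF steps following one PA step.

    Exactly l < m - 1 TF steps occur with probability p**l * (1 - p);
    the maximum of m - 1 steps occurs with probability p**(m - 1).
    """
    partial = sum(l * p**l * (1 - p) for l in range(1, m - 1))
    return partial + (m - 1) * p ** (m - 1)


def pa_steps_per_node(p, m):
    """m_PA from the constraint m_PA + m_PA * m_TF = m."""
    return m / (1 + expected_tf_steps(p, m))
```

For m = 5 the sketch gives $m_{TF} \approx 0.0101$ at p = 0.01, consistent with $m_{TF} \approx p$ for small p, and $m_{TF} = 4$, $m_{PA} = 1$ at p = 1. In the latter case the prefactor $4 m_{PA} m_{TF}/m$ of the 1/k term that follows from Eq. (7) without approximating $m_{PA}$ by m equals 3.2.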

The fact that the local clustering coefficient contains a
constant term means that there is a crossover at a certain
$k^*$. At this point, the power law turns over to a
constant clustering coefficient. $k^*$ can be estimated by
taking the two terms in Eq. (8) to be equal; with
$m_{TF} \approx p$ and $m - 1 \approx m$ in the prefactor
this gives

$$ k^* \approx \frac{32}{m (\log N)^2}\, p N. $$

Thus we can conclude that the exponents of Eq. (3) are
$\gamma = 1/\alpha = 1$ and $\delta = 1$ for the triad
formation model, and from above, $\alpha = 1$. E.g., in the
case of Figure 2, taking $N = 10^6$ yields $k^* \approx 400$.
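The crossover can also be evaluated numerically by equating the two terms of Eq. (8) directly, keeping the factor m - 1 of the constant term. The Python sketch below is only an illustration of that arithmetic; the function names are hypothetical and $m_{TF}$ is passed in as a parameter (for small p one may take $m_{TF} \approx p$).

```python
import math


def ba_clustering_constant(n_nodes, m):
    """Constant term of Eq. (8): (m - 1)/8 * (log N)^2 / N."""
    return (m - 1) / 8 * math.log(n_nodes) ** 2 / n_nodes


def predicted_clustering(k, n_nodes, m, m_tf):
    """Eq. (8): power-law part 4 m_TF / k plus the BA constant."""
    return 4 * m_tf / k + ba_clustering_constant(n_nodes, m)


def crossover_degree(n_nodes, m, m_tf):
    """Degree k* at which the two terms of Eq. (8) are equal."""
    return 4 * m_tf / ba_clustering_constant(n_nodes, m)
```

With $N = 10^6$, m = 5 and $m_{TF} = 0.01$, the constant term is about $9.5 \cdot 10^{-5}$ and the crossover lies near $k^* \approx 420$, the same order as the estimate quoted above.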

Simulations of the model consistently confirm the
analytical results obtained from the rate equation. In
Figure 2 networks of different sizes are shown to undergo
such a transition to constant clustering when p is tuned
so that $k^*$ is smaller than the maximum degree in the
networks. A phenomenon similar to the transition described
above can be observed in the case of the actor network of
the IMDB database, where the tail of a decreasing power law
becomes constant, although large fluctuations naturally
affect this part of the statistics. Figure 3 shows networks
well below the transition, and thus almost only the
power-law part is visible.

FIG. 3: Clustering coefficient for networks of $10^6$ nodes
and m = 5; the triad formation probability is p = 0.2, 0.4,
0.6, 0.8, and 1, for ○, +, *, □, and ◇, respectively. The
fit of Eq. (8) for p = 1 gives $C \approx 5.28\,k^{-1}$,
whereas the relation is expected to be
$C \approx 3.2\,k^{-1} + 9.5 \cdot 10^{-5}$.

It is not unusual in the physics of scale-free networks
that mean-field approaches work well. This fact is related
to the strongly hierarchical nature of the networks grown
by preferential attachment, and our study demonstrates that
this situation remains unaltered even when considering a
mechanism which enhances clustering. The agreement between
the $1/k^{\alpha}$ dependence with $\alpha = 1$ obtained in
Eq. (8) and that found in real networks indicates that the
same "mean-field" mechanisms of clustering are operative.
For PA growth with enhanced clustering the simplest
interpretation is that for each new link a node i gains
from a new node introduced to the network, its neighbors
("friends") also have a constant probability to be linked
to the same new node.



It is interesting to ask how robust the mean-field exponent
is and what the limits of the above approach are,
especially in the light of the recently discovered networks
with $\alpha \neq 1$. The rate equations allow one to
discuss how such exponents can emerge. Eq. (4) implies that
the clustering is crucially dependent on the properties of
the nodes in the neighborhood, $\Omega$. If, say,
correlations from "assortative" or "disassortative" mixing
arise between $k_i$ and the average degree
$\langle k_n \rangle$ ($n \in \Omega$), this may either
enhance ($\alpha < 1$) or inhibit ($\alpha > 1$) clustering
relative to the mean-field result. On the level of models,
one can envision changing the k- and p-dependence of the
rates. The second possibility is fluctuation effects that
limit the validity of the rate-equation theory. It would
seem to be of interest to explore both these issues.
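Whether such degree correlations are present in a given network can be inspected by measuring the average neighbor degree $\langle k_n \rangle$ as a function of the degree k of the node itself. The Python sketch below is one simple way to do this, assuming the graph is stored as a dictionary mapping each node to the set of its neighbors; the function name is hypothetical.

```python
from collections import defaultdict


def mean_neighbor_degree_by_degree(adj):
    """Return {k: average of <k_n> over all nodes of degree k}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k == 0:
            continue  # isolated nodes have no neighborhood
        # <k_n> for this node: mean degree of its neighbors
        mean_kn = sum(len(adj[u]) for u in neighbors) / k
        sums[k] += mean_kn
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}
```

A curve that increases with k points to assortative mixing, a decreasing one to disassortative mixing, and a flat one to the uncorrelated case assumed in the mean-field treatment.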

In conclusion, we have formulated a scaling assumption and
a mean-field theory of the clustering of scale-free
networks. A specific example, the triad formation model,
has been solved, and comparisons to the simulations
indicate good agreement and yield the MF value of the
exponent $\alpha$. This approach should be applicable to
many of the models in the literature, and help in
understanding the origins of the statistical properties of
clustering, also beyond the C(k) function, both in models
and in the many real-life examples of networks. We have
here considered only growing networks, but obviously the
rate equations can also be written down in the case where
the structural dynamics allows for deleting edges as well.